Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve googleapis use_languages extension tag #3437

Closed
wants to merge 1 commit into from

Conversation

mering
Copy link
Contributor

@mering mering commented Dec 16, 2024

This obeys requested bindings from any module.

@bazel-io
Copy link
Member

Hello @bazelbuild/bcr-maintainers, modules without existing maintainers (googleapis) have been updated in this PR. Please review the changes.

@mering
Copy link
Contributor Author

mering commented Dec 16, 2024

@bazel-io skip_check unstable_url

@bazel-io bazel-io added the skip-url-stability-check Skip the URL stability check for the PR label Dec 16, 2024
@mering mering mentioned this pull request Dec 16, 2024
@mering mering force-pushed the googleapis-use-languages branch 3 times, most recently from 947450d to 6c7eb25 Compare December 16, 2024 18:53
@mering mering changed the title Improve use_languages extension tag Improve googleapis use_languages extension tag Dec 16, 2024
@mering
Copy link
Contributor Author

mering commented Dec 17, 2024

@meteorcloudy @fmeum @keith could you PTAL? Thanks!

fmeum
fmeum previously approved these changes Dec 17, 2024
This obeys requested bindings from any module.
@mering mering force-pushed the googleapis-use-languages branch from 6c7eb25 to 5bda46c Compare December 17, 2024 10:34
@bazel-io bazel-io dismissed fmeum’s stale review December 17, 2024 10:34

Require module maintainers' approval for newly pushed changes.

@mering
Copy link
Contributor Author

mering commented Dec 17, 2024

@fmeum Tests also succeeded for Bazel 8.x.

@fmeum
Copy link
Contributor

fmeum commented Dec 17, 2024

Having thought more about this, I do have some concerns about this. By allowing BCR modules to start using googleapis as is, we are making it more difficult for us to migrate the module to a reasonable structure: It has to be split into a module per language.

@meteorcloudy @Wyverald Is official Bzlmod support for googleapis on the maintainers' roadmap?

@mering
Copy link
Contributor Author

mering commented Dec 17, 2024

Having thought more about this, I do have some concerns about this. By allowing BCR modules to start using googleapis as is, we are making it more difficult for us to migrate the module to a reasonable structure: It has to be split into a module per language.

@meteorcloudy @Wyverald Is official Bzlmod support for googleapis on the maintainers' roadmap?

There are already various BCR modules using googleapis. This basically just allows us to test these targets in the BCR CI as well. If you prefer modules using googleapis without tests on BCR in the meantime please let me know.

@fmeum
Copy link
Contributor

fmeum commented Dec 17, 2024

As far as I understand the module extension logic in the current version, only the root module can enable languages. For languages such as C#, even the root module likely can't do that since googleapis doesn't declare the required bazel_deps.

Presumably this means that any user of a module that transitively uses googleapis needs to include a switched_rules_by_language usage in their root module file. Can you do the same to run your tests in BCR CI? If not, how do your tests differ from what a regular use of this module would look like? I'm pretty sure we can get you decent test coverage somehow without "prod changes".

The problem with module extensions that do something for non-root modules is that they very quickly form public API that is very difficult to change - there is no way to have bazel_features like logic in module files and even max_compatibility_level doesn't allow a module to be compatible with two different versions of a module extension. Since googleapis doesn't even come with upstream support for Bzlmod yet, we would severely limit this future support if we patch in this kind of API now. Based on my experience with googleapis so far, it has been a constant source of big trouble and without a better API than we have right now (including with this PR), we could end up cementing that state even with Bzlmod.

@mering
Copy link
Contributor Author

mering commented Dec 17, 2024

Presumably this means that any user of a module that transitively uses googleapis needs to include a switched_rules_by_language usage in their root module file. Can you do the same to run your tests in BCR CI? If not, how do your tests differ from what a regular use of this module would look like? I'm pretty sure we can get you decent test coverage somehow without "prod changes".

How can I tell the BCR CI to add a call to switched_rules_by_language call to the testing module? The CI doesn't run th tests in the module itself but creates a separated testing module only depending on the uploaded module for tests.

The problem with module extensions that do something for non-root modules is that they very quickly form public API that is very difficult to change - there is no way to have bazel_features like logic in module files and even max_compatibility_level doesn't allow a module to be compatible with two different versions of a module extension. Since googleapis doesn't even come with upstream support for Bzlmod yet, we would severely limit this future support if we patch in this kind of API now. Based on my experience with googleapis so far, it has been a constant source of big trouble and without a better API than we have right now (including with this PR), we could end up cementing that state even with Bzlmod.

Where is the difference to requiring this in the root module? In both cases, one needs to configure the extension. Let's assume in the future the call to switched_rules_by_language is no longer needed for example. So one could just make it a noop. Let's assume there is any other way to enable languages, then one can just call the other function to enable the languages from the previous extension. So the only public API is the use_languages extension tag which already exists before this PR.

@fmeum
Copy link
Contributor

fmeum commented Dec 17, 2024

It's quite possible that the new "API" will be that there are separate googleapi-<lang> modules that one needs to explicitly depend on. That's not a change that can be performed in a backwards compatible way. If only root modules need to change, that's a very manageable change, if it transitively breaks BCR modules, that will result in far more pain.

I'm not fully against making this change, but I would like to give the maintainers of the upstream repo a say in this.

@mering
Copy link
Contributor Author

mering commented Dec 17, 2024

It's quite possible that the new "API" will be that there are separate googleapi-<lang> modules that one needs to explicitly depend on. That's not a change that can be performed in a backwards compatible way. If only root modules need to change, that's a very manageable change, if it transitively breaks BCR modules, that will result in far more pain.

I'm not fully against making this change, but I would like to give the maintainers of the upstream repo a say in this.

This would also break transitive modules with the current version as they require the targets to exist. When they are moved to separate modules, those transitive dependency modules would equally break.
Currently, only the root module triggers the generation of the language specific bindings while other modules can already use those generated bindings.

@fmeum
Copy link
Contributor

fmeum commented Dec 17, 2024

This would also break transitive modules with the current version as they require the targets to exist. When they are moved to separate modules, those transitive dependency modules would equally break.

That's not necessarily the case: We could arrange this in a way where users depending on googleapis-java communicates to googleapis via a module extension that the Java variant is available. googleapis could then forward the Java targets that currently exist to the Java "overlay" through aliases. In this way, we could maintain backwards compatibility with transitive usages.

You are right that the module extension could become a noop in the future. I don't enough about googleapis to determine whether the contents of com_google_googleapis_imports are relevant for other modules. If they are and this change would make it possible to control the repo's contents, that would create new concerns as a true noop implementation would no longer do it. Do you know what it's used for?

@mering
Copy link
Contributor Author

mering commented Dec 17, 2024

I don't enough about googleapis to determine whether the contents of com_google_googleapis_imports are relevant for other modules. If they are and this change would make it possible to control the repo's contents, that would create new concerns as a true noop implementation would no longer do it. Do you know what it's used for?

I don't know what exactly it is used for but I know that a depending module doesn't need the use_repo() to com_google_googleapis_imports and can still use the language specific proto bindings. So this seems to be used internally only.

@mering
Copy link
Contributor Author

mering commented Dec 18, 2024

I just want to clarify that this change:

  • does not change the way how non-root modules can use googleapis
  • mainly relieves the burden for the root module to specify desired googleapis languages for transitive dependencies
  • does not break any existing usages

@Wyverald
Copy link
Member

Sorry for the late reply.

@meteorcloudy @Wyverald Is official Bzlmod support for googleapis on the maintainers' roadmap?

Unfortunately not. The googleapis team at Google aren't very interested in the Bazel use case.

I don't enough about googleapis to determine whether the contents of com_google_googleapis_imports are relevant for other modules.

Nobody should use com_google_googleapis_imports other than googleapis itself. (Maybe we need better visibility mechanisms... gah.)


The short version of my opinion on this PR is that it's in the right direction. The long version follows.

The only parts of the googleapis module that should be consumed publicly are the protos, and several core language APIs (Java, C++, Go, Python). All other language bindings are actually only used by googleapis itself to build the language-specific binding libraries that are then shipped separately.

I had tried before to write an "injection" mechanism, where the targets for e.g. the Java API aren't generated if rules_java hasn't "injected" itself into googleapis, but that has several problems: 1) core rulesets like rules_java would need to add a dependency on googleapis to perform injecion (this wasn't really a problem pre-Bazel 8); 2) the various more exotic APIs (Ruby, TypeScript, etc) named in the current googleapis module are only used by itself, so they're "dev deps" that would never really be "injected"; 3) point 2 means that we could simply just always generate all the core language bindings and have googleapis always depend on the core rulesets, but this becomes slightly more of a problem in Bazel 8+.

So the solution depends on how bad we think 3) is. If it's not a big deal for googleapis to always depend on rules_go, then we could just make the switched_rules extension a no-op (effectively consider all core languages in there to be True and all others to be if googleapis is root). If 3) is bad and we really want a googleapis with basically no depedencies other than protobuf (somewhat unclear how to achieve this since protobuf itself also needs to be split up), we could look for other avenues, but that would be a backwards incompatible change already so doing this PR wouldn't be harmful.

Curious to hear what y'all think. @mering, if we could hold this PR for one more day for thoughts on this, you might end up with an even simpler PR (if we decide to make the extension no-op).

@meteorcloudy
Copy link
Member

meteorcloudy commented Dec 19, 2024

Can we have modules like googleapis_py, googleapis_java, googleapis_cc? They still points to the same source archive, but allows different language APIs to be used with proper dependencies.

@mering
Copy link
Contributor Author

mering commented Dec 19, 2024

Can we have modules like googleapis_py, googleapis_java, googleapis_cc? They still points to the same source archive, but allows different language APIs to be used with proper dependencies.

IIUC consumers would have to adapt to the new modules as a breaking change. The current extension could create these repositories or opt as a noop for the course of the migration. So I don't see any conflicts with this PR.

@fmeum
Copy link
Contributor

fmeum commented Dec 19, 2024

Can we have modules like googleapis_py, googleapis_java, googleapis_cc? They still points to the same source archive, but allows different language APIs to be used with proper dependencies.

I think that there is a simpler solution: We can extend the module extension in this PR to accept not just tags with java = True, but java_library_bzl = "@rules_java//java:java_library.bzl". We can then add googleapis_<lang> modules that consist entirely of very small patches: They just depend on rules_<lang> and forward the appropriate load label to googleapis via the extension.

Each module will be able to get e.g. java_proto_library targets by depending on googleapis-java module in addition to googleapis. This is future proof as we can update the googleapis-java if proto language rules move or protobuf is split up further. While we can't force higher versions of googleapis-java through googleapis (we don't want that dependency edge), we can check its version via the module_ctx API and ask users to update - but that should be rarely needed.

@mering What do you think of this approach? It's not much more effort than your current PR, which could serve as the base.

mainly relieves the burden for the root module to specify desired googleapis languages for transitive dependencies

That's the crucial part for me here: While the PR does not add new API that's hard to maintain, it adds API expectations that we won't be able to meet longterm. It creates the expectation that root modules no longer need to manually configure the switched_rules_by_language extension since their transitive deps take care of it, which we would have to break when migrating to some scheme like the one I described above. This expectation doesn't exist right now, which makes any breaking change less surprising.

@mering
Copy link
Contributor Author

mering commented Dec 19, 2024

mainly relieves the burden for the root module to specify desired googleapis languages for transitive dependencies

That's the crucial part for me here: While the PR does not add new API that's hard to maintain, it adds API expectations that we won't be able to meet longterm. It creates the expectation that root modules no longer need to manually configure the switched_rules_by_language extension since their transitive deps take care of it, which we would have to break when migrating to some scheme like the one I described above. This expectation doesn't exist right now, which makes any breaking change less surprising.

I still think that the root module should not need to take care of its transitive dependencies. Isn't this the whole point of Bzlmod?

@mering What do you think of this approach? It's not much more effort than your current PR, which could serve as the base.

I have no experience with repository rules or module extensions and I don't have capacity to work on more than this PR.

@fmeum
Copy link
Contributor

fmeum commented Dec 19, 2024

I have no experience with repository rules or module extensions and I don't have capacity to work on more than this PR.

No problem. If we end up agreeing on an approach along the lines of #3437 (comment), I can implement it.

the various more exotic APIs (Ruby, TypeScript, etc) named in the current googleapis module are only used by itself, so they're "dev deps" that would never really be "injected"

Is it clear that this will remain so? As soon as there is a googleapis module in the BCR and some languages are not getting it from their respective package managers, having all languages get it from the BCR sounds like the best remaining option.

@mering
Copy link
Contributor Author

mering commented Dec 20, 2024

As this PR doesn't seem to follow a desired approach, I am going to close it.

Feel free to continue the discussion about a more preferable approach here.

@mering mering closed this Dec 20, 2024
@mering mering deleted the googleapis-use-languages branch December 20, 2024 09:41
@fmeum
Copy link
Contributor

fmeum commented Dec 21, 2024

#3472 implements the approach I described in #3437 (comment).

@mering I hope that this addresses all your use cases. After your PR #3470, I could also add support for cc_grpc_library.

@mering
Copy link
Contributor Author

mering commented Dec 21, 2024

#3472 implements the approach I described in #3437 (comment).

@mering I hope that this addresses all your use cases. After your PR #3470, I could also add support for cc_grpc_library.

Thanks for going ahead and put your suggestion into code to make it more understandable. IIUC one would still use the targets from @googleapis//.... So the only difference seems to be to add a dependency googleapis-lang to activate the rules for lang even when not used from that repo. Also when another (transitive) dependency already depends on this, my module wouldn't fail without declaring this explicitly. To me this seems a bit counter intuitive (although you added user warnings to guide people). Can you explain why this is preferable to using the use_language extension (within googleapis module as people use targets from this module) as proposed in this PR?

@fmeum
Copy link
Contributor

fmeum commented Dec 21, 2024

IIUC one would still use the targets from @googleapis//....

That's correct. My favorite setup for googleapis would be separate modules per language where each module googleapis-lang only contains lang_proto_library targets. Without cooperation from upstream, which we don't have, that's probably not realistic to achieve and maintain.

Also when another (transitive) dependency already depends on this, my module wouldn't fail without declaring this explicitly.

This is also correct. I don't think that there is a way to realize "strict deps checking" without moving targets to other repos.

Can you explain why this is prederable to using the use_language extension (within googleapis module as people use targets from this module) as proposed in this PR?

The crucial difference is that the multi-module design ensures that the transitive module dependencies are only those actually needed by the root module and its transitive deps that need googleapis. If you and your deps don't need go_proto_library targets, you don't end up paying the dependency footprint cost of having rules_go in your deps. After protobuf has been split up (which I'm pretty sure will need to happen at some point), this would even allow setups that depend on rules_java but not rules_python.

This is particularly important as we may need to extend the scope of googleapis in the future (think PHP/ruby/rust/...). We could easily add googleapis-lang modules for every language without having to add ever more deps for all users.

@mering
Copy link
Contributor Author

mering commented Dec 22, 2024

The crucial difference is that the multi-module design ensures that the transitive module dependencies are only those actually needed by the root module and its transitive deps that need googleapis. If you and your deps don't need go_proto_library targets, you don't end up paying the dependency footprint cost of having rules_go in your deps. After protobuf has been split up (which I'm pretty sure will need to happen at some point), this would even allow setups that depend on rules_java but not rules_python.

This is particularly important as we may need to extend the scope of googleapis in the future (think PHP/ruby/rust/...). We could easily add googleapis-lang modules for every language without having to add ever more deps for all users.

How big is the overhead of adding a dependency to such a rules module if you don't end up depending on any targets using it?
Is it more than downloading the small repo?

@fmeum
Copy link
Contributor

fmeum commented Dec 22, 2024

How big is the overhead of adding a dependency to such a rules module if you don't end up depending on any targets using it?
Is it more than downloading the small repo?

Yes, in different ways:

  • Depending on modules that themselves have further dependencies (as is the case for most rulesets) makes your module graph deeper, which results in slower cold builds since module files are discovered and downloaded by walking the graph.
  • Some rulesets result in eager fetches due to toolchain registrations (this happened with Protobuf's dependency on rules_rust).
  • It becomes more likely that incompatibilities with new Bazel versions block module updates.
  • New modules add new module extensions, which affects bazel mod performance and can bloat the lockfile.

Ideally all of these will become non-issues over time as the Bzlmod subsystem is optimized further, but I think that we should be remain cautious for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
skip-url-stability-check Skip the URL stability check for the PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants